Modeling speaking rate for voice fonts

Authors

  • Ashish Verma
  • Arun Kumar
Abstract

Voice fonts are created and stored for a speaker so that speech can later be synthesized in that speaker’s voice. The most important descriptors of a voice font are the spectral envelope of its acoustic units and prosodic features such as fundamental frequency and average speaking rate. In this paper, we present a new approach to modeling speaking rate so that it can be easily incorporated into voice fonts and used for personality transformation. We model speaking rate as the speaker’s average duration for various acoustic units and categories. With the proposed approach, the speaking rate can be extracted automatically from a speech corpus in the speaker’s voice. We show how the approach can be implemented and evaluate its performance through various subjective tests.
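
The idea of representing speaking rate as average durations per acoustic unit can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the corpus has already been segmented into labeled units with start and end times (for example by forced alignment), and the function names `average_durations` and `duration_scale_factors` are hypothetical, introduced only for illustration.

```python
from collections import defaultdict


def average_durations(segments):
    """Return the mean duration (seconds) of each unit label.

    `segments` is an iterable of (label, start_sec, end_sec) tuples.
    The segmentation itself is assumed to exist already; this sketch
    does not perform alignment.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for label, start, end in segments:
        if end <= start:
            raise ValueError(f"non-positive duration for unit {label!r}")
        totals[label] += end - start
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


def duration_scale_factors(source_avg, target_avg):
    """Per-unit factors that stretch source durations toward the target's.

    Only units present in both models get a factor (an assumption of
    this sketch; a real system would need a fallback for unseen units).
    """
    return {
        unit: target_avg[unit] / source_avg[unit]
        for unit in source_avg
        if unit in target_avg
    }


# Toy data: two speakers, units "a" and "t", times in seconds.
source_segments = [("a", 0.0, 0.25), ("t", 0.25, 0.5), ("a", 0.5, 1.0)]
target_segments = [("a", 0.0, 0.75), ("t", 0.75, 1.0)]

source_model = average_durations(source_segments)
target_model = average_durations(target_segments)
factors = duration_scale_factors(source_model, target_model)
```

Storing one average per unit keeps the rate model small enough to sit alongside the other voice-font descriptors, and the per-unit ratios give a simple way to move one speaker's timing toward another's.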

Related resources

Modeling perceived vocal age in American English

An acoustic analysis of voice, articulatory, and prosodic cues to perceived age was completed for a speech database of 150 American English speakers. Perceived ages were submitted to multiple linear regression analyses with measures of acoustic correlates of: voice quality, articulation, fundamental frequency, and prosody. The fit between predicted and actual perceived ages from the resulting m...

Alleviating Iranian EFL Students’ Speaking Anxiety: Mobile-assisted instruction vs. traditional instruction

Language classrooms are occasionally anxiety-breeding situations. Foreign language classroom anxiety which negatively affects foreign language learning is typically associated with productive activities mainly speaking skill. To cope with the issue and overcome language learning difficulties, the present study was conducted to explore the impact of mobile-assisted language learning on enhancing...

Perceptual normalization for speaking rate III: Effects of the rate of one voice on perception of another

Individuals vary their speaking rate, and listeners use the speaking rate of precursor sentences to adjust for these changes (Kidd, 1989). Most of the research on this adjustment process has focused on situations in which there was only a single stream of speech over which such perceptual adjustment could occur. Yet listeners are often faced with environments in which multiple people are speaki...

Speaking rate normalization with lattice-based context-dependent phoneme duration modeling for personalized speech recognizers on mobile devices

Voice access of cloud applications including social networks using mobile devices becomes attractive today. And personalized speech recognizers over mobile devices become feasible because most mobile devices have only a single user. Speaking rate variation is known to be an important source of performance degradation for spontaneous speech recognition. Speaking rate is speaker dependent, it cha...

An ergonomic study of typographic parameters in Persian typefaces

The extensive development of written interactions in the current world of technology on the one hand, and the noticeable dominance of the English language in this milieu on the other, has led to inadequate utilization of Farsi in such settings, even amongst native speakers. Lack of experimental data regarding legibility and readability of the printed and electronic texts related ...

Publication date: 2003